
A Gaussian Process Framework for Solving Forward and Inverse Problems Involving Nonlinear Partial Differential Equations (2401.03492v2)

Published 7 Jan 2024 in cs.LG, cs.NA, and math.NA

Abstract: Physics-informed machine learning (PIML) has emerged as a promising alternative to conventional numerical methods for solving partial differential equations (PDEs). PIML models are increasingly built via deep neural networks (NNs) whose architecture and training process are designed such that the network satisfies the PDE system. While such PIML models have substantially advanced over the past few years, their performance is still very sensitive to the NN's architecture and loss function. Motivated by this limitation, we introduce kernel-weighted Corrective Residuals (CoRes) to integrate the strengths of kernel methods and deep NNs for solving nonlinear PDE systems. To achieve this integration, we design a modular and robust framework which consistently outperforms competing methods in solving a broad range of benchmark problems. This performance improvement has a theoretical justification and is particularly attractive since we simplify the training process while negligibly increasing the inference costs. Additionally, our studies on solving multiple PDEs indicate that kernel-weighted CoRes considerably decrease the sensitivity of NNs to factors such as random initialization, architecture type, and choice of optimizer. We believe our findings have the potential to spark a renewed interest in leveraging kernel methods for solving PDEs.
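The core idea can be illustrated with a minimal sketch. The snippet below is a conceptual reading of kernel-weighted corrective residuals, not the paper's implementation: a network's output is corrected by a kernel-interpolation term fit to the boundary/initial data, so those data are reproduced exactly while the network remains free to fit the PDE residual in the interior. All names (`rbf_kernel`, `cores_predict`, the stand-in `net`) and the toy 1-D setup are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=0.5):
    # Gaussian (RBF) kernel between two sets of 1-D points.
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)

def cores_predict(x, x_bc, u_bc, net, length_scale=0.5, jitter=1e-10):
    # Kernel-weighted corrective residual (conceptual sketch):
    #   u(x) = net(x) + k(x, X_bc) K^{-1} (u_bc - net(X_bc)).
    # The kernel term interpolates the network's error at the
    # boundary points, so the boundary data are matched exactly
    # (up to the numerical jitter) regardless of how well the
    # network itself fits them.
    K = rbf_kernel(x_bc, x_bc, length_scale) + jitter * np.eye(len(x_bc))
    k = rbf_kernel(x, x_bc, length_scale)
    residual = u_bc - net(x_bc)  # network's error on the boundary data
    return net(x) + k @ np.linalg.solve(K, residual)

# Toy example: even a deliberately wrong "network" satisfies the BCs.
x_bc = np.array([0.0, 1.0])      # boundary points
u_bc = np.array([0.0, 2.0])      # prescribed boundary values
net = lambda x: 0.3 * x          # stand-in for a trained NN
x = np.linspace(0.0, 1.0, 5)
u = cores_predict(x, x_bc, u_bc, net)
```

Under this reading, the network no longer has to trade off boundary accuracy against the PDE residual in a composite loss, which is one plausible source of the reduced sensitivity to architecture and optimizer choices reported in the abstract.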

References (60)
  1. Schaeffer, H., Caflisch, R., Hauck, C.D., Osher, S.: Sparse dynamics for partial differential equations. Proceedings of the National Academy of Sciences 110(17), 6634–6639 (2013) Mozaffar et al. [2019] Mozaffar, M., Bostanabad, R., Chen, W., Ehmann, K., Cao, J., Bessa, M.A.: Deep learning predicts path-dependent plasticity. Proc Natl Acad Sci U S A 116(52), 26414–26420 (2019) https://doi.org/10.1073/pnas.1911815116 Rahimi-Aghdam et al. [2019] Rahimi-Aghdam, S., Chau, V.T., Lee, H., Nguyen, H., Li, W., Karra, S., Rougier, E., Viswanathan, H., Srinivasan, G., Bazant, Z.P.: Branching of hydraulic cracks enabling permeability of gas or oil shale with closed natural fractures. Proc Natl Acad Sci U S A 116(5), 1532–1537 (2019) https://doi.org/10.1073/pnas.1818529116 Bar-Sinai et al. [2019] Bar-Sinai, Y., Hoyer, S., Hickey, J., Brenner, M.P.: Learning data-driven discretizations for partial differential equations. Proc Natl Acad Sci U S A 116(31), 15344–15349 (2019) https://doi.org/10.1073/pnas.1814058116 Rasp et al. [2018] Rasp, S., Pritchard, M.S., Gentine, P.: Deep learning to represent subgrid processes in climate models. Proc Natl Acad Sci U S A 115(39), 9684–9689 (2018) https://doi.org/10.1073/pnas.1810286115 Santolini and Barabási [2018] Santolini, M., Barabási, A.-L.: Predicting perturbation patterns from the topology of biological networks. Proceedings of the National Academy of Sciences 115(27), 6375–6383 (2018) Lucor et al. [2022] Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022) Fang et al. [2023] Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. 
[2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. 
[2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
[2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Mozaffar, M., Bostanabad, R., Chen, W., Ehmann, K., Cao, J., Bessa, M.A.: Deep learning predicts path-dependent plasticity. Proc Natl Acad Sci U S A 116(52), 26414–26420 (2019) https://doi.org/10.1073/pnas.1911815116 Rahimi-Aghdam et al. [2019] Rahimi-Aghdam, S., Chau, V.T., Lee, H., Nguyen, H., Li, W., Karra, S., Rougier, E., Viswanathan, H., Srinivasan, G., Bazant, Z.P.: Branching of hydraulic cracks enabling permeability of gas or oil shale with closed natural fractures. Proc Natl Acad Sci U S A 116(5), 1532–1537 (2019) https://doi.org/10.1073/pnas.1818529116 Bar-Sinai et al. 
[2019] Bar-Sinai, Y., Hoyer, S., Hickey, J., Brenner, M.P.: Learning data-driven discretizations for partial differential equations. Proc Natl Acad Sci U S A 116(31), 15344–15349 (2019) https://doi.org/10.1073/pnas.1814058116 Rasp et al. [2018] Rasp, S., Pritchard, M.S., Gentine, P.: Deep learning to represent subgrid processes in climate models. Proc Natl Acad Sci U S A 115(39), 9684–9689 (2018) https://doi.org/10.1073/pnas.1810286115 Santolini and Barabási [2018] Santolini, M., Barabási, A.-L.: Predicting perturbation patterns from the topology of biological networks. Proceedings of the National Academy of Sciences 115(27), 6375–6383 (2018) Lucor et al. [2022] Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022) Fang et al. [2023] Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. 
[2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. 
Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rahimi-Aghdam, S., Chau, V.T., Lee, H., Nguyen, H., Li, W., Karra, S., Rougier, E., Viswanathan, H., Srinivasan, G., Bazant, Z.P.: Branching of hydraulic cracks enabling permeability of gas or oil shale with closed natural fractures. Proc Natl Acad Sci U S A 116(5), 1532–1537 (2019) https://doi.org/10.1073/pnas.1818529116 Bar-Sinai et al. [2019] Bar-Sinai, Y., Hoyer, S., Hickey, J., Brenner, M.P.: Learning data-driven discretizations for partial differential equations. Proc Natl Acad Sci U S A 116(31), 15344–15349 (2019) https://doi.org/10.1073/pnas.1814058116 Rasp et al. [2018] Rasp, S., Pritchard, M.S., Gentine, P.: Deep learning to represent subgrid processes in climate models. Proc Natl Acad Sci U S A 115(39), 9684–9689 (2018) https://doi.org/10.1073/pnas.1810286115 Santolini and Barabási [2018] Santolini, M., Barabási, A.-L.: Predicting perturbation patterns from the topology of biological networks. Proceedings of the National Academy of Sciences 115(27), 6375–6383 (2018) Lucor et al. [2022] Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022) Fang et al. [2023] Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. 
[2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. 
[2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
[2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al.
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library.
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002) O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. 
Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. 
[2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. 
[2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. 
[2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  2. Mozaffar, M., Bostanabad, R., Chen, W., Ehmann, K., Cao, J., Bessa, M.A.: Deep learning predicts path-dependent plasticity. Proc Natl Acad Sci U S A 116(52), 26414–26420 (2019). https://doi.org/10.1073/pnas.1911815116
  3. Rahimi-Aghdam, S., Chau, V.T., Lee, H., Nguyen, H., Li, W., Karra, S., Rougier, E., Viswanathan, H., Srinivasan, G., Bazant, Z.P.: Branching of hydraulic cracks enabling permeability of gas or oil shale with closed natural fractures. Proc Natl Acad Sci U S A 116(5), 1532–1537 (2019). https://doi.org/10.1073/pnas.1818529116
  4. Bar-Sinai, Y., Hoyer, S., Hickey, J., Brenner, M.P.: Learning data-driven discretizations for partial differential equations. Proc Natl Acad Sci U S A 116(31), 15344–15349 (2019). https://doi.org/10.1073/pnas.1814058116
  5. Rasp, S., Pritchard, M.S., Gentine, P.: Deep learning to represent subgrid processes in climate models. Proc Natl Acad Sci U S A 115(39), 9684–9689 (2018). https://doi.org/10.1073/pnas.1810286115
  6. Santolini, M., Barabási, A.-L.: Predicting perturbation patterns from the topology of biological networks. Proceedings of the National Academy of Sciences 115(27), 6375–6383 (2018)
  7. Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022)
  8. Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023)
  9. Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022)
 10. Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature Communications 10(1), 2339 (2019)
 11. Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023)
 12. Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
 13. Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
 14. Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
 15. Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
 16. Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
 17. Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
 18. Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017). https://doi.org/10.1073/pnas.1704711114
 19. Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
 20. Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020). https://doi.org/10.1073/pnas.1922210117
 21. Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022). https://doi.org/10.1016/j.cma.2021.114424
 22. von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
 23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021). https://doi.org/10.1038/s42254-021-00314-5
 24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
 25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
 26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
 27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
 28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
 29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
 30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
 31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
 32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
 33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
 34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
 35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
 36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
 37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
 38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
 39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
 40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
 41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
 42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
 43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
 44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
 45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
 46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
 47. Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
 48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
 49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
 50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
 51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
 52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
 53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
 54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
 55. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998). Accessed Feb 9, 2018
 56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
 57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
 58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
 59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
 60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022) Fang et al. [2023] Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. 
[2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. 
Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. 
[2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 
675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in Neural Information Processing Systems 30 (2017)
Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. 
[2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. 
Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. 
[2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. 
Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020). https://doi.org/10.1073/pnas.1922210117
Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022). https://doi.org/10.1016/j.cma.2021.114424
von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021). https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. 
Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Rahimi-Aghdam, S., Chau, V.T., Lee, H., Nguyen, H., Li, W., Karra, S., Rougier, E., Viswanathan, H., Srinivasan, G., Bazant, Z.P.: Branching of hydraulic cracks enabling permeability of gas or oil shale with closed natural fractures. Proceedings of the National Academy of Sciences 116(5), 1532–1537 (2019) https://doi.org/10.1073/pnas.1818529116
Bar-Sinai, Y., Hoyer, S., Hickey, J., Brenner, M.P.: Learning data-driven discretizations for partial differential equations. Proceedings of the National Academy of Sciences 116(31), 15344–15349 (2019) https://doi.org/10.1073/pnas.1814058116
Rasp, S., Pritchard, M.S., Gentine, P.: Deep learning to represent subgrid processes in climate models. Proceedings of the National Academy of Sciences 115(39), 9684–9689 (2018) https://doi.org/10.1073/pnas.1810286115
Santolini, M., Barabási, A.-L.: Predicting perturbation patterns from the topology of biological networks. Proceedings of the National Academy of Sciences 115(27), 6375–6383 (2018)
Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022)
Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023)
Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022)
Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature Communications 10(1), 2339 (2019)
Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023)
Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Santolini, M., Barabási, A.-L.: Predicting perturbation patterns from the topology of biological networks. Proceedings of the National Academy of Sciences 115(27), 6375–6383 (2018) Lucor et al. [2022] Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022) Fang et al. [2023] Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. 
[2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. 
[2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022) Fang et al. [2023] Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. 
[2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. 
[2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. 
[2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O’Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017)
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasp, S., Pritchard, M.S., Gentine, P.: Deep learning to represent subgrid processes in climate models. Proc Natl Acad Sci U S A 115(39), 9684–9689 (2018) https://doi.org/10.1073/pnas.1810286115 Santolini and Barabási [2018] Santolini, M., Barabási, A.-L.: Predicting perturbation patterns from the topology of biological networks. Proceedings of the National Academy of Sciences 115(27), 6375–6383 (2018) Lucor et al. [2022] Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022) Fang et al. [2023] Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. 
[2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. 
[2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
[2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9 (2018), 32 (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature Communications 10(1), 2339 (2019)
Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023)
Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. 
IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. 
[2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al. [2022] von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics [1998] COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. 
[2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. 
[2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1–3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Advances in Neural Information Processing Systems 30 (2017)
Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al. [2022] von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  6. Santolini, M., Barabási, A.-L.: Predicting perturbation patterns from the topology of biological networks. Proceedings of the National Academy of Sciences 115(27), 6375–6383 (2018)
  7. Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022)
  8. Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023)
  9. Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022)
  10. Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature Communications 10(1), 2339 (2019)
  11. Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023)
  12. Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
  13. Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
  14. Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
  15. Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
  16. Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
  17. Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
  18. Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
  19. Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
  20. Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
  21. Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
  22. von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
  23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
  24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378 (2016). PMLR
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9, 2018 (1998)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Advances in neural information processing systems 30 (2017) Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022) Fang et al. [2023] Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. 
[2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. 
[2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). 
IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. 
[2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. 
[2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations.
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput.
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9 (2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002) O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature Communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition.
Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Advances in neural information processing systems 30 (2017) Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. 
[2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. 
Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. 
[2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. 
[2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1–3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  7. Lucor, D., Agrawal, A., Sergent, A.: Simple computational strategies for more effective physics-informed neural networks modeling of turbulent natural convection. Journal of Computational Physics 456, 111022 (2022) Fang et al. [2023] Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. 
[2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. 
[2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). 
IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. 
[2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. 
[2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018) Chen et al.
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput.
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. 
[2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. 
Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. 
[2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. 
[2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9 (2018), 32 (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions.
Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al.
[2022] von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
[2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domains. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Applied and Computational Mathematics 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, vol. 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  8. Fang, Q., Mou, X., Li, S.: A physics-informed neural network based on mixed data sampling for solving modified diffusion equations. Scientific Reports 13(1), 2491 (2023) Jagtap et al. [2022] Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. 
Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. 
[1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022) Pun et al. 
[2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. 
[2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9, 2018 (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature Communications 10(1), 2339 (2019)
Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023)
Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al. [2022] von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. 
IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. 
[2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
[2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al.
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need.
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
Jagtap, A.D., Mao, Z., Adams, N., Karniadakis, G.E.: Physics-informed neural networks for inverse problems in supersonic flows. Journal of Computational Physics 466, 111402 (2022)
Pun et al. [2019] Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature Communications 10(1), 2339 (2019)
Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023)
Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al.
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Pun, G.P., Batra, R., Ramprasad, R., Mishin, Y.: Physically informed artificial neural networks for atomistic modeling of materials. Nature communications 10(1), 2339 (2019) Lotfollahi et al. [2023] Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. 
[2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. 
Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023) Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018) Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. 
[2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
[2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9 (2018), 32 (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Kozuch et al. [2018] Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
Coin et al. [2003] Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al. [2022] von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. 
[2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. 
IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9, 2018 (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 
794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017) Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018) van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9 (2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002) O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions.
Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes.
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. 
[2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. 
Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
[2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. 
Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al. [2022] von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics [1998] COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. 
In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009)
Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. 
Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009)
Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A.V., Theis, F.J.: Biologically informed deep learning to query gene programs in single-cell atlases. Nature Cell Biology 25(2), 337–350 (2023)
Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, vol. 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. 
Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. 
Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. 
[2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1–3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, vol. 2, pp. 749–754 (1987)
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1–3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Vol. 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
  12. Kozuch, D.J., Stillinger, F.H., Debenedetti, P.G.: Combined molecular dynamics and neural network method for predicting protein antifreeze activity. Proceedings of the National Academy of Sciences 115(52), 13252–13257 (2018)
  13. Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003)
  14. Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
  15. Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
  16. Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
  17. Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
  18. Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
  19. Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
  20. Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
  21. Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
  22. von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
  23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
  24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  38. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  39. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  40. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  41. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  42. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  43. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  44. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  45. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  46. Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
  47. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  48. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  49. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  50. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  51. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  52. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  53. Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  54. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998), accessed Feb 9, 2018
  55. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  56. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  57. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  58. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  59. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
van der Meer et al.
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? 
(2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  13. Coin, L., Bateman, A., Durbin, R.: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proceedings of the National Academy of Sciences 100(8), 4516–4520 (2003) Curtarolo et al. [2013] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. 
Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. 
Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. 
[2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, vol. 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Butler et al. [2018] Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al. [2022] von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
[2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 
794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics [1998] COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
14. Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature Materials 12(3), 191–201 (2013)
15. Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018)
16. Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
17. Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
18. Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
19. Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
20. Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
21. Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
22. von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
37. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
38. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
39. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
40. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
41. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
42. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
43. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
44. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
45. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
46. Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
47. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
48. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
49. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
50. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
51. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
52. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
53. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
54. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
55. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
56. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
57. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
58. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
59. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
Journal of Computational Physics 435, 110242 (2021)
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  15. Butler, K.T., Davies, D.W., Cartwright, H., Isayev, O., Walsh, A.: Machine learning for molecular and materials science. Nature 559(7715), 547–555 (2018) Hart et al. [2021] Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. 
Nature Reviews Materials 6(8), 730–755 (2021) Shi et al. [2019] Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. [2017] Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114 Gebru et al. [2017] Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. 
In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019) Lee et al. 
Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. 
[2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. 
Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. 
[2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. 
[1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain.
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions.
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. 
[2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
Multiphysics [1998] COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
  16. Hart, G.L., Mueller, T., Toher, C., Curtarolo, S.: Machine learning for alloys. Nature Reviews Materials 6(8), 730–755 (2021)
  17. Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
  18. Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
  19. Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
  20. Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
  21. Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
  22. von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
  23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
  24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9 (2018), 32 (1998)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. 
In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. 
In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Applied and Computational Mathematics 8(1), 107–113 (2009)
Multiphysics [1998] COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998). Accessed Feb 9, 2018
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  17. Shi, Z., Tsymbalov, E., Dao, M., Suresh, S., Shapeev, A., Li, J.: Deep elastic strain engineering of bandgap through machine learning. Proceedings of the National Academy of Sciences 116(10), 4117–4122 (2019)
  18. Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proceedings of the National Academy of Sciences 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
  19. Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
  20. Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proceedings of the National Academy of Sciences 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
  21. Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
  22. von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
  23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
  24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
  55. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998), accessed Feb 9, 2018
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. 
In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. 
[2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. 
[2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. 
[2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378 (2016). PMLR
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  18. Lee, W.K., Yu, S., Engel, C.J., Reese, T., Rhee, D., Chen, W., Odom, T.W.: Concurrent design of quasi-random photonic nanostructures. Proc Natl Acad Sci U S A 114(33), 8734–8739 (2017) https://doi.org/10.1073/pnas.1704711114
  19. Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
  20. Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
  21. Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
  22. von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
  23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
  24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library.
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and google street view to estimate the demographic makeup of neighborhoods across the united states. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017) Lu et al. [2020] Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117 Wang et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. 
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: A survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998, accessed 9 Feb 2018)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, vol. 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  19. Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences 114(50), 13108–13113 (2017)
  20. Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
  21. Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
  22. von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
  23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
  24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998). Accessed Feb 9, 2018
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. 
Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? 
(2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
20. Lu, L., Dao, M., Kumar, P., Ramamurty, U., Karniadakis, G.E., Suresh, S.: Extraction of mechanical properties of materials through deep learning from instrumented indentation. Proc Natl Acad Sci U S A 117(13), 7052–7062 (2020) https://doi.org/10.1073/pnas.1922210117
Wang et al. [2022] Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
von Saldern et al. [2022] von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: A survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®.
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving pdes on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424 von Saldern et al. [2022] Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. 
[1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. 
Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. 
In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9 (2018), 32 (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  21. Wang, H., Planas, R., Chandramowlishwaran, A., Bostanabad, R.: Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains. Computer Methods in Applied Mechanics and Engineering 389, 114424 (2022) https://doi.org/10.1016/j.cma.2021.114424
  22. von Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
  23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
  24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  38. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  39. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  40. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  41. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  42. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  43. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  44. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  45. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  46. Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al.
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022) Karniadakis et al. [2021] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5 Djeridane and Lygeros [2006] Djeridane, B., Lygeros, J.: Neural approximation of pde solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE Lagaris et al. [1998] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks 9(5), 987–1000 (1998) Raissi et al. [2019] Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378, 686–707 (2019) Sirignano and Spiliopoulos [2018] Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. 
Advances in Neural Information Processing Systems 30 (2017)
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  22. Saldern, J.G., Reumschüssel, J.M., Kaiser, T.L., Sieber, M., Oberleithner, K.: Mean flow data assimilation based on physics-informed neural networks. Physics of Fluids 34(11) (2022)
  23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
  24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039 (2006). IEEE
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, 32 (1998), accessed Feb 9, 2018
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  23. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed machine learning. Nature Reviews Physics 3(6), 422–440 (2021) https://doi.org/10.1038/s42254-021-00314-5
  24. Djeridane, B., Lygeros, J.: Neural approximation of PDE solutions: An application to reachability computations. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 3034–3039. IEEE (2006)
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9, 2018 (1998)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence – Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. 
Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1–3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
  25. Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9(5), 987–1000 (1998)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Applied and Computational Mathematics 8(1), 107–113 (2009)
  55. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, 32 (1998). Accessed Feb 9, 2018
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sirignano, J., Spiliopoulos, K.: Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics 375, 1339–1364 (2018) McClenny and Braga-Neto [2020] McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020) Jagtap et al. [2020] Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 
794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9, 2018 (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA (2002)
O’Sullivan et al. [1986] O’Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, vol. 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  26. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Applied and Computational Mathematics 8(1), 107–113 (2009)
  55. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL, Burlington, MA (1998)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, vol. 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. 
Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
[2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, vol. 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  27. Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, 1339–1364 (2018)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  38. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  39. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  40. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  41. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  42. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  43. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  44. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  45. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  46. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
  47. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  48. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  49. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  50. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  51. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  52. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  53. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  54. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
  55. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  56. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  57. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  58. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  59. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. 
In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? 
(2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  28. McClenny, L., Braga-Neto, U.: Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544 (2020)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. 
[2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. 
IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. 
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Applied and Computational Mathematics 8(1), 107–113 (2009)
COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  29. Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E.: Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404, 109136 (2020)
  30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683. SIAM (2021)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  38. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  39. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  40. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  41. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  42. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  43. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  44. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  45. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  46. Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
  47. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  48. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  49. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  50. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  51. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  52. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  53. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  54. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
  55. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  56. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  57. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  58. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
  59. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. 
[2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. 
In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? 
(2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
30. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Bu and Karpatne [2021] Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM
Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models.
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. 
SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. 
[2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. 
Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Applied and Computational Mathematics 8(1), 107–113 (2009)
COMSOL Multiphysics [1998] COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1–3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  31. Bu, J., Karpatne, A.: Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving pdes. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM), pp. 675–683 (2021). SIAM Gao et al. [2021] Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gao, H., Sun, L., Wang, J.-X.: Phygeonet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state pdes on irregular domain. Journal of Computational Physics 428, 110079 (2021) Jagtap and Karniadakis [2021] Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (xpinns): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021) Baydin et al. [2018] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Marchine Learning Research 18, 1–43 (2018) Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803 (2018). PMLR van der Meer et al. [2022] Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. 
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Applied and Computational Mathematics 8(1), 107–113 (2009)
COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  32. Gao, H., Sun, L., Wang, J.-X.: PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 428, 110079 (2021)
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Gardner et al.
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
  33. Jagtap, A.D., Karniadakis, G.E.: Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. In: AAAI Spring Symposium: MLPS, vol. 10 (2021)
  34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
  35. Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
  36. van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998). Accessed Feb 9, 2018
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
34. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 1–43 (2018)
Chen et al. [2018] Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer et al. [2022] van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
Ohwada [2009] Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
Multiphysics [1998] Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9, 2018, 32 (1998)
Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
O'Sullivan et al. [1986] O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions.
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving pdes with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022) Wang et al. [2021] Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. 
Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
Chen, Z., Badrinarayanan, V., Lee, C.-Y., Rabinovich, A.: GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In: International Conference on Machine Learning, pp. 794–803. PMLR (2018)
van der Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. 
arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  36. Meer, R., Oosterlee, C.W., Borovykh, A.: Optimally weighted loss functions for solving PDEs with neural networks. Journal of Computational and Applied Mathematics 405, 113887 (2022)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021)
  38. Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998; accessed Feb 9, 2018)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. 
Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  37. Wang, S., Teng, Y., Perdikaris, P.: Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing 43(5), 3055–3081 (2021) Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. 
[2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020) Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021) McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
[2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
Lagari et al. [2020] Lagari, P.L., Tsoukalas, L.H., Safarkhani, S., Lagaris, I.E.: Systematic construction of neural forms for solving partial differential equations inside rectangular domains, subject to initial, boundary and interface conditions. International Journal on Artificial Intelligence Tools 29(05), 2050009 (2020)
Dong and Ni [2021] Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
McFall and Mahan [2009] McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
Meng and Yang [2023] Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning (2006)
Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. 
[2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. 
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  39. Dong, S., Ni, N.: A method for representing periodic functions and enforcing exactly periodic boundary conditions with deep neural networks. Journal of Computational Physics 435, 110242 (2021)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Applied and Computational Mathematics 8(1), 107–113 (2009)
  55. COMSOL Multiphysics: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  40. McFall, K.S., Mahan, J.R.: Artificial neural network method for solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE Transactions on Neural Networks 20(8), 1221–1233 (2009) Berg and Nyström [2018] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018) Owhadi et al. [2019] Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019) Zhang et al. [2022] Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. 
arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in Neural Information Processing Systems 30 (2017)
  41. Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, 28–41 (2018)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1–3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Applied and Computational Mathematics 8(1), 107–113 (2009)
  55. COMSOL: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence, Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  42. Owhadi, H., Scovel, C., Schäfer, F.: Statistical numerical approximation. Notices of the AMS (2019)
  43. Zhang, J., Zhang, S., Lin, G.: PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network Gaussian processes. arXiv preprint arXiv:1707.05922 (2017)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. Advances in Neural Information Processing Systems 31 (2018)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR (2016)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole-Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math 8(1), 107–113 (2009)
  55. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA (1998), accessed Feb 9, 2018
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O'Sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
[1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  43. Zhang, J., Zhang, S., Lin, G.: Pagp: A physics-assisted gaussian process framework with active learning for forward and inverse problems of partial differential equations. arXiv preprint arXiv:2204.02583 (2022) Iwata and Ghahramani [2017] Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. 
[2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? 
(2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? 
(2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. 
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  44. Iwata, T., Ghahramani, Z.: Improving output uncertainty estimation and generalization in deep learning via neural network gaussian processes. arXiv preprint arXiv:1707.05922 (2017) Meng and Yang [2023] Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Meng, R., Yang, X.: Sparse gaussian processes for solving nonlinear pdes. Journal of Computational Physics 490, 112340 (2023) Chen et al. [2021] Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear pdes with gaussian processes. 
Journal of Computational Physics 447, 110668 (2021) Rasmussen [2006] Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Rasmussen, C.E.: Gaussian Processes for Machine Learning, (2006) Gardner et al. [2018] Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. 
Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. 
[2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. 
IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  45. Meng, R., Yang, X.: Sparse Gaussian processes for solving nonlinear PDEs. Journal of Computational Physics 490, 112340 (2023)
  46. Chen, Y., Hosseini, B., Owhadi, H., Stuart, A.M.: Solving and learning nonlinear PDEs with Gaussian processes. Journal of Computational Physics 447, 110668 (2021)
  47. Rasmussen, C.E.: Gaussian Processes for Machine Learning. MIT Press (2006)
In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  48. Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G.: Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. Advances in neural information processing systems 31 (2018) Wilson et al. [2016] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. 
Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. 
COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. 
Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
  49. Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378. PMLR, ??? (2016) Kingma and Ba [2014] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) Liu and Nocedal [1989] Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Liu, D.C., Nocedal, J.: On the limited memory bfgs method for large scale optimization. Mathematical programming 45(1-3), 503–528 (1989) Sun et al. [2019] Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE transactions on cybernetics 50(8), 3668–3681 (2019) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Ohwada [2009] Ohwada, T.: Cole-hopf transformation as numerical tool for the burgers equation. Appl. Comput. Math 8(1), 107–113 (2009) Multiphysics [1998] Multiphysics, C.: Introduction to comsol multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9(2018), 32 (1998) Schölkopf and Smola [2002] Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT press, ??? (2002) O’sullivan et al. [1986] O’sullivan, F., Yandell, B.S., Raynor Jr, W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986) Kimeldorf and Wahba [1971] Kimeldorf, G., Wahba, G.: Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications 33(1), 82–95 (1971) Szeliski [1987] Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial intelligence-Volume 2, pp. 749–754 (1987) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019)
  50. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  51. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming 45(1-3), 503–528 (1989)
  52. Sun, S., Cao, Z., Zhu, H., Zhao, J.: A survey of optimization methods from a machine learning perspective. IEEE Transactions on Cybernetics 50(8), 3668–3681 (2019)
  53. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: an imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
  54. Ohwada, T.: Cole–Hopf transformation as numerical tool for the Burgers equation. Appl. Comput. Math. 8(1), 107–113 (2009)
  55. Multiphysics, C.: Introduction to COMSOL Multiphysics®. COMSOL Multiphysics, Burlington, MA, accessed Feb 9, 2018, 32 (1998)
  56. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press (2002)
  57. O’Sullivan, F., Yandell, B.S., Raynor Jr., W.J.: Automatic smoothing of regression functions in generalized linear models. Journal of the American Statistical Association 81(393), 96–103 (1986)
  58. Kimeldorf, G., Wahba, G.: Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications 33(1), 82–95 (1971)
  59. Szeliski, R.: Regularization uses fractal priors. In: Proceedings of the Sixth National Conference on Artificial Intelligence – Volume 2, pp. 749–754 (1987)
  60. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)